NSF PAR Search | NSF Public Access Repository

Algorithm Selection for Deep Active Learning with Imbalanced Datasets

Zhang, Jifan; Shao, Shuai; Verma, Saurabh; Nowak, Robert (September 2023, Advances in Neural Information Processing Systems)

Label efficiency has become an increasingly important objective in deep learning applications. Active learning aims to reduce the number of labeled examples needed to train deep networks, but the empirical performance of active learning algorithms can vary dramatically across datasets and applications. It is difficult to know in advance which active learning strategy will perform well or best in a given application. To address this, we propose the first adaptive algorithm selection strategy for deep active learning. For any unlabeled dataset, our (meta) algorithm TAILOR (Thompson ActIve Learning algORithm selection) iteratively and adaptively chooses among a set of candidate active learning algorithms. TAILOR uses novel reward functions aimed at gathering class-balanced examples. Extensive experiments in multi-class and multi-label applications demonstrate TAILOR's effectiveness in achieving accuracy comparable to or better than that of the best of the candidate algorithms. Our implementation of TAILOR is open-sourced at https://github.com/jifanz/TAILOR.
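The selection loop described in this abstract can be illustrated with a generic Thompson sampling sketch. This is not the authors' implementation (see the linked repository for that). It assumes a simplified Bernoulli reward of 1 when the chosen strategy's queried example turns out to belong to a minority class, and the names `ThompsonSelector`, `simulate`, and `minority_rates` are hypothetical.

```python
import random


class ThompsonSelector:
    """Beta-Bernoulli Thompson sampling over a fixed set of candidate strategies.

    Assumption: rewards are 0 or 1. The reward functions in the paper are
    richer; this only illustrates the bandit-style selection loop.
    """

    def __init__(self, n_arms, seed=0):
        # Beta(1, 1) prior (uniform) on each strategy's success rate.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = random.Random(seed)

    def choose(self):
        # Draw one sample per strategy from its posterior and pick the largest.
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Conjugate posterior update for a Bernoulli reward.
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward


def simulate(minority_rates, rounds=2000, seed=0):
    """Simulate selection among strategies with hypothetical minority-hit rates.

    minority_rates[i] is the assumed probability that strategy i queries a
    minority-class example. Returns how often each strategy was chosen.
    """
    rng = random.Random(seed + 1)
    selector = ThompsonSelector(len(minority_rates), seed=seed)
    pulls = [0] * len(minority_rates)
    for _ in range(rounds):
        arm = selector.choose()
        reward = 1.0 if rng.random() < minority_rates[arm] else 0.0
        selector.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate([0.05, 0.1, 0.6])` returns per-strategy selection counts. Under these assumed rates, most selections should concentrate on the third strategy as its posterior sharpens.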
Full Text Available
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

https://doi.org/10.52202/079017-3978

Chen, Jiayi; Guo, Yang; Jain, Lalit; Jamieson, Kevin; Mankoff, Robert; Nowak, Robert; Rogers, Timothy; Sievert, Scott; Suresh, Siddharth; Wagenmaker, Andrew; et al (January 2024, Neural Information Processing Systems Foundation, Inc. (NeurIPS))

Full Text Available
GALAXY: Graph-based Active Learning at the Extreme

Zhang, Jifan; Katz-Samuels, Julian; Nowak, Robert (January 2022, International Conference on Machine Learning)

Active learning is a label-efficient approach to train highly effective models while interactively selecting only small subsets of unlabeled data for labeling and training. In "open world" settings, the classes of interest can make up a small fraction of the overall dataset; most of the data may be viewed as an out-of-distribution or irrelevant class. This leads to extreme class imbalance, and our theory and methods focus on this core issue. We propose a new strategy for active learning called GALAXY (Graph-based Active Learning At the eXtrEme), which blends ideas from graph-based active learning and deep learning. GALAXY automatically and adaptively selects more class-balanced examples for labeling than most other methods for active learning. Our theory shows that GALAXY performs a refined form of uncertainty sampling that gathers a much more class-balanced dataset than vanilla uncertainty sampling. Experimentally, we demonstrate GALAXY's superiority over existing state-of-the-art deep active learning algorithms in unbalanced vision classification settings generated from popular datasets.
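One ingredient commonly used in graph-based active learning is bisection along a linear graph of examples sorted by model score, which steers label queries toward the boundary between classes. The sketch below shows only that generic idea for the binary case. It is not GALAXY itself, and the function name `bisect_cut_edge` and the `oracle` callback are hypothetical.

```python
def bisect_cut_edge(scores, oracle):
    """Locate a label boundary on a score-sorted linear graph by bisection.

    scores: one model score per example (assumed to rank the examples).
    oracle: callable mapping an example index to its true binary label;
            each call stands for one labeling query.
    Returns (queried, cut): the example indices that were labeled, and the
    pair of adjacent examples (in sorted order) with different labels, or
    None if the two endpoints share a label.
    """
    # Nodes of the linear graph: example indices ordered by score.
    order = sorted(range(len(scores)), key=scores.__getitem__)
    labels = {}

    def query(pos):
        idx = order[pos]
        if idx not in labels:
            labels[idx] = oracle(idx)
        return labels[idx]

    lo, hi = 0, len(order) - 1
    if query(lo) == query(hi):
        # Endpoints agree, so bisection cannot guarantee finding a cut.
        return list(labels), None
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Keep the half whose endpoints still carry different labels.
        if query(mid) == query(lo):
            lo = mid
        else:
            hi = mid
    return list(labels), (order[lo], order[hi])
```

With eight examples of which only the two highest-scoring are positive, this procedure labels five examples and both positives are among them, so the labeled set is more balanced than the pool. That is the intuition behind boundary-seeking queries under class imbalance.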
Full Text Available
Improved Algorithms for Agnostic Pool-based Active Classification

Katz-Samuels, Julian; Zhang, Jifan; Jain, Lalit; Jamieson, Kevin (January 2021, Proceedings of Machine Learning Research)

Full Text Available
